21 research outputs found
An automated classification system for leukocyte morphology in acute myeloid Leukemia
The diagnosis of hematological malignancies, and of acute myeloid leukemia in particular, has undergone wide-ranging advances in recent years, driven by an increasingly detailed knowledge of their underlying biological and genetic mechanisms. Nevertheless, cytomorphologic evaluation of samples of peripheral blood and bone marrow is still an integral part of the routine diagnostic workup. Microscopic analysis of these samples has so far defied automation and is still mainly performed by human cytologists manually classifying and counting relevant cell populations. Access to this diagnostic modality is therefore limited by the number and availability of trained cytologists. Furthermore, its results rest on the judgments of examiners, which may vary according to their education and experience, rendering rigorous quantification and standardization of the method difficult.
In this thesis, an approach to cytomorphologic classification is presented that aims to harness recent advances in computational image classification for leukocyte differentiation using Deep Learning techniques that derive from the domain of Artificial Intelligence. In a first stage of the project, peripheral blood smear samples from both AML patients and controls were scanned using techniques from digital pathology. Experienced cytologists from the Laboratory of Leukemia Diagnostics at the LMU Klinikum annotated the digitized samples according to a scheme of 15 morphological categories derived from standard routine diagnostics. The resulting set of over 18,000 annotated single-cell images is the largest public database of leukocyte morphologies in leukemia available today.
In a second step, the compiled dataset was used to develop a neural network that is able to classify leukocytes into the standard morphological scheme. Evaluation of network predictions shows that the network performs well at the classification task for most clinically relevant categories, with an error pattern similar to that of human examiners. The network can also be employed to answer two questions of immediate clinical relevance, namely whether a given single-cell image shows a blast-like cell, and whether it belongs to the set of atypical cells which are not present in peripheral blood smears under physiological conditions. On these questions, the network performs similarly to, and slightly better than, the human examiner. These results show the potential of Deep Learning techniques in the field of hematological diagnostics and suggest avenues for their further development as a helpful tool of leukemia diagnostics.
In the diagnostics of hematological diseases such as acute myeloid leukemia, significant advances have been made in recent years, based above all on a deeper understanding of their biological and genetic causes. Nevertheless, the cytomorphological examination of blood and bone marrow preparations continues to play a central role in the diagnostic workup. Microscopic assessment of these preparations has so far not been automated and is still carried out by human examiners, who manually differentiate and count relevant cell types. Access to cytomorphological examinations is therefore limited by the number of available cytological examiners. Moreover, assessment of the preparations rests on the individual judgment of the examiners and thus depends on their training and experience, which further complicates a standardized and quantitative evaluation of morphology.
The aim of this work is to develop a computer-based system that supports the morphological differentiation of leukocytes. To this end, it draws on powerful algorithms developed in recent years in the field of Artificial Intelligence, in particular so-called Deep Learning. In a first step of the project, peripheral blood smears from AML patients and controls were digitized using methods of digital pathology. Experienced examiners from the Laboratory of Leukemia Diagnostics at the LMU Klinikum in Munich annotated the digitized preparations and sorted the cells into a 15-class standard scheme taken from routine diagnostics. In this way, with over 18,000 morphologically annotated leukocytes, the largest publicly available dataset of relevant single-cell images to date was compiled.
In a second phase of the project, this dataset was used to train convolutional neural networks to classify single-cell images. An analysis of their predictions shows that these networks differentiate single-cell images of most cell classes very successfully. For misclassified images, their error pattern resembles that of human examiners. Beyond classifying individual cells, the networks also allow coarser, binary questions to be answered, such as whether a given cell has blast character or belongs to the morphological classes that do not occur in a peripheral blood smear under physiological conditions. On these questions, the networks perform similarly to, and slightly better than, the human examiner. The results of this work illustrate the potential of Artificial Intelligence methods in hematology and open up possibilities for developing them further into a practical tool for leukemia diagnostics.
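One plausible way to turn a multi-class cell classifier into the coarser binary questions described above is to sum the predicted class probabilities over a subset of categories. The sketch below illustrates that idea only. The class names, the membership of the blast-like subset, and the 0.5 threshold are assumptions made for illustration; the abstract does not state how the thesis derives its binary decisions.

```python
import numpy as np

# Hypothetical, shortened label set. The thesis uses 15 categories whose
# names are not listed in the abstract.
CLASSES = ["neutrophil", "lymphocyte", "monocyte", "myeloblast", "monoblast"]
BLAST_LIKE = {"myeloblast", "monoblast"}  # assumed subset, for illustration


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def binary_score(probs, classes, positive):
    """Sum class probabilities over the classes in `positive`."""
    mask = np.array([c in positive for c in classes])
    return probs[..., mask].sum(axis=-1)


def is_positive(logits, classes=CLASSES, positive=BLAST_LIKE, threshold=0.5):
    """Answer the binary question by thresholding the summed probability."""
    probs = softmax(np.asarray(logits, dtype=float))
    return binary_score(probs, classes, positive) >= threshold
```

Calling `is_positive` on the network output for one image returns a single boolean; passing a batch of logits with shape `(n_images, n_classes)` returns one boolean per image.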
Coarse-grained modelling of supercoiled RNA
We study the behaviour of double-stranded RNA under twist and tension using
oxRNA, a recently developed coarse-grained model of RNA. Introducing explicit
salt-dependence into the model allows us to directly compare our results to
data from recent single-molecule experiments. The model reproduces extension
curves as a function of twist and stretching force, including the buckling
transition and the behaviour of plectoneme structures. For negative
supercoiling, we predict denaturation bubble formation in plectoneme end-loops,
suggesting preferential plectoneme localisation in weak base sequences. OxRNA
exhibits a positive twist-stretch coupling constant, in agreement with recent
experimental observations.
Comment: 8 pages + 5 pages Supplementary Material
Long-range correlations in the mechanics of small DNA circles under topological stress revealed by multi-scale simulation
It is well established that gene regulation can be achieved through activator and repressor proteins that bind to DNA and switch particular genes on or off, and that complex metabolic networks determine the levels of transcription of a given gene at a given time. Using three complementary computational techniques to study the sequence-dependence of DNA denaturation within DNA minicircles, we have observed that whenever the ends of the DNA are constrained, information can be transferred over long distances directly by the transmission of mechanical stress through the DNA itself, without any requirement for external signalling factors. Our models combine atomistic molecular dynamics (MD) with coarse-grained simulations and statistical mechanical calculations to span three distinct spatial resolutions and timescale regimes. While they give a consensus view of the non-locality of sequence-dependent denaturation in highly bent and supercoiled DNA loops, each also reveals a unique aspect of long-range informational transfer that occurs as a result of restraining the DNA within the closed loop of the minicircles.
Data Models for Dataset Drift Controls in Machine Learning With Images
Camera images are ubiquitous in machine learning research. They also play a
central role in the delivery of important services spanning medicine and
environmental surveying. However, the application of machine learning models in
these domains has been limited because of robustness concerns. A primary
failure mode is a drop in performance due to differences between the training and
deployment data. While there are methods to prospectively validate the
robustness of machine learning models to such dataset drifts, existing
approaches do not account for explicit models of the primary object of
interest: the data. This makes it difficult to create physically faithful drift
test cases or to provide specifications of data models that should be avoided
when deploying a machine learning model. In this study, we demonstrate how
these shortcomings can be overcome by pairing machine learning robustness
validation with physical optics. We examine the role raw sensor data and
differentiable data models can play in controlling performance risks related to
image dataset drift. The findings are distilled into three applications. First,
drift synthesis enables the controlled generation of physically faithful drift
test cases. The experiments presented here show that the average decrease in
model performance is four to ten times less severe than under post-hoc
augmentation testing. Second, the gradient connection between task and data
models allows for drift forensics that can be used to specify
performance-sensitive data models which should be avoided during deployment of
a machine learning model. Third, drift adjustment opens up the possibility for
processing adjustments in the face of drift. This can lead to a speed-up and
stabilization of classifier training at a margin of up to 20% in validation
accuracy. A guide to access the open code and datasets is available at
https://github.com/aiaudit-org/raw2logit.
Comment: LO and MA contributed equally
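To make the idea of drift synthesis concrete, the sketch below processes the same raw sensor array through a toy, parametrized data model and varies the parameters to generate drift test cases. It is a minimal illustration under stated assumptions: the three parameters (`black_level`, `gain`, `gamma`) and both function names are hypothetical and do not reproduce the data models in the raw2logit repository.

```python
import numpy as np


def data_model(raw, black_level=0.05, gain=1.0, gamma=2.2):
    """Toy processing chain (assumed): black-level subtraction, gain,
    clipping to [0, 1], then gamma correction."""
    x = np.clip((raw - black_level) * gain, 0.0, 1.0)
    return x ** (1.0 / gamma)


def synthesize_drift(raw, param_grid):
    """Process one raw measurement under each parameter setting."""
    return [data_model(raw, **params) for params in param_grid]


# Stand-in for a raw sensor measurement with intensities in [0, 1].
rng = np.random.default_rng(0)
raw = rng.uniform(0.0, 1.0, size=(4, 4))

# Each grid entry is one drift test case derived from the same raw data.
grid = [{"gain": g, "gamma": gm} for g in (0.8, 1.0, 1.2) for gm in (1.8, 2.2)]
variants = synthesize_drift(raw, grid)
```

Because every variant comes from the same raw measurement, differences in downstream model performance across `variants` can be attributed to the processing parameters rather than to scene content.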
Data models for dataset drift controls in machine learning with optical images
Camera images are ubiquitous in machine learning research. They also play a central role
in the delivery of important public services spanning medicine or environmental surveying.
However, the application of machine learning models in these domains has been limited
because of robustness concerns. A primary failure mode is a drop in performance due to
differences between the training and deployment data. While there are methods to prospectively validate the robustness of machine learning models to such dataset drifts, existing
approaches do not account for explicit models of machine learning’s primary object of interest:
the data. This limits our ability to study and understand the relationship between data
generation and downstream machine learning model performance in a physically accurate
manner. In this study, we demonstrate how to overcome this limitation by pairing traditional
machine learning with physical optics to obtain explicit and differentiable data models. We
demonstrate how such data models can be constructed for image data and used to control
downstream machine learning model performance related to dataset drift. The findings
are distilled into three applications. First, drift synthesis enables the controlled generation
of physically faithful drift test cases to power model selection and targeted generalization.
Second, the gradient connection between machine learning task model and data model allows
advanced, precise tolerancing of task model sensitivity to changes in the data generation.
These drift forensics can be used to precisely specify the acceptable data environments
in which a task model may be run. Third, drift optimization opens up the possibility to
create drifts that can help the task model learn better and faster, effectively optimizing the
data generating process itself to support the downstream machine vision task. This is an
interesting upgrade to existing imaging pipelines which traditionally have been optimized to
be consumed by human users but not machine learning models. The data models require
access to raw sensor images as commonly processed at scale in industry domains such as
microscopy, biomedicine, autonomous vehicles or remote sensing. Alongside the data model
code we release two datasets to the public that we collected as part of this work. In total,
the two datasets, Raw-Microscopy and Raw-Drone, comprise 1,488 scientifically calibrated
reference raw sensor measurements, 8,928 raw intensity variations as well as 17,856 images
processed through twelve data models with different configurations. A guide to access the
open code and datasets is available at https://github.com/aiaudit-org/raw2logit
DNA cruciform arms nucleate through a correlated but non-synchronous cooperative mechanism
Inverted repeat (IR) sequences in DNA can form non-canonical cruciform
structures to relieve torsional stress. We use Monte Carlo simulations of a
recently developed coarse-grained model of DNA to demonstrate that the
nucleation of a cruciform can proceed through a cooperative mechanism. Firstly,
a twist-induced denaturation bubble must diffuse so that its midpoint is near
the centre of symmetry of the IR sequence. Secondly, bubble fluctuations must
be large enough to allow one of the arms to form a small number of hairpin
bonds. Once the first arm is partially formed, the second arm can rapidly grow
to a similar size. Because bubbles can twist back on themselves, they need
considerably fewer bases to resolve torsional stress than the final cruciform
state does. The initially stabilised cruciform therefore continues to grow,
which typically proceeds synchronously, reminiscent of the S-type mechanism of
cruciform formation. By using umbrella sampling techniques we calculate, for
different temperatures and superhelical densities, the free energy as a
function of the number of bonds in each cruciform along the correlated but
non-synchronous nucleation pathways we observed in direct simulations.
Comment: 12 pages main paper + 11 pages supplementary data
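The basic step behind an umbrella-sampling free-energy estimate can be sketched as follows: states are sampled with an artificial bias, and the histogram is then divided by that bias before taking the logarithm. The example assumes a single sampling window in which the bias enters as a known multiplicative weight W(Q) on the sampling probability of each value of the order parameter Q. The function name and this single-window setup are assumptions for illustration, not the procedure used in the paper.

```python
import numpy as np


def free_energy_from_biased_counts(counts, weights, kT=1.0):
    """Unbias a histogram collected with multiplicative weights W(Q).

    counts[i]  : number of biased samples observed in bin i
    weights[i] : bias weight W applied to bin i during sampling
    Returns F(Q) = -kT ln P(Q), shifted so that its minimum is zero.
    Bins that were never visited are returned as NaN.
    """
    counts = np.asarray(counts, dtype=float)
    weights = np.asarray(weights, dtype=float)
    p = np.full(counts.shape, np.nan)
    visited = counts > 0
    # Dividing by the weight removes the artificial bias from the histogram.
    p[visited] = counts[visited] / weights[visited]
    p /= np.nansum(p)
    f = -kT * np.log(p)
    return f - np.nanmin(f)
```

For example, a flat biased histogram obtained with weights that grow by a factor of e per bin corresponds to a free energy that rises by one kT per bin.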
Statistical mechanics of nucleic acids under mechanical stress
In this thesis, the response of DNA and RNA to linear and torsional mechanical stress is studied using coarse-grained models. Inspired by single-molecule assays developed over the last two decades, the end-to-end extension, buckling and torque response behaviour of the stressed molecules is probed under conditions similar to experimentally used setups. Direct comparison with experimental data yields excellent agreement for many conditions. Results from coarse-grained simulations are also compared to the predictions of continuum models of linear polymer elasticity. A state diagram for supercoiled DNA as a function of twist and tension is determined. A novel conformational state of mechanically stressed DNA is proposed, consisting of a plectonemic structure with a denaturation bubble localized in its end-loop. The interconversion between this novel state and other, known structural motifs of supercoiled DNA is studied in detail. In particular, the influence of sequence properties on the novel state is investigated. Several possible implications for supercoiled DNA structures in vivo are discussed. Furthermore, the dynamical consequences of coupled denaturation and writhing are studied, and used to explain observations from recent single-molecule experiments of DNA strand dynamics. Finally, the denaturation behaviour, topology and dynamics of short DNA minicircles are studied using coarse-grained simulations. Long-range interactions in the denaturation behaviour of the system are observed. These are induced by the topology of the system, and are consistent with results from recent molecular imaging studies. The results from coarse-grained simulations are related to modelling of the same system in all-atom simulations and a local denaturation model of DNA, yielding insight into the applicability of these different modelling approaches to study different processes in nucleic acids.
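A widely used continuum model of polymer elasticity is the worm-like chain. The sketch below evaluates the Marko-Siggia interpolation formula, which gives the stretching force as a function of relative extension. Whether the thesis compares against this particular expression is an assumption. The default parameter values (persistence length 50 nm, kT = 4.11 pN nm at roughly 298 K) are typical textbook values for double-stranded DNA and are not taken from the thesis.

```python
def wlc_force(rel_ext, persistence_length_nm=50.0, kT_pN_nm=4.11):
    """Marko-Siggia interpolation for the worm-like chain.

    rel_ext : end-to-end extension divided by contour length, in [0, 1)
    Returns the force in pN:
        F = (kT / Lp) * [ 1 / (4 (1 - x)^2) - 1/4 + x ]
    """
    if not 0.0 <= rel_ext < 1.0:
        raise ValueError("relative extension must lie in [0, 1)")
    prefactor = kT_pN_nm / persistence_length_nm
    return prefactor * (0.25 / (1.0 - rel_ext) ** 2 - 0.25 + rel_ext)


# Force-extension curve at a few relative extensions.
curve = [(x, wlc_force(x)) for x in (0.1, 0.5, 0.9)]
```

The force vanishes at zero extension and diverges as the extension approaches the contour length, which is the qualitative behaviour against which simulated force-extension curves are usually checked.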